Abstract: We consider a scenario where N users try to access a
common base station. Associated with each user is a channel state
and a finite queue, both of which vary with time. Each user chooses
his power and admission control variable dynamically so as to
maximize his expected throughput. The throughput of each user is a
function of the actions and states of all users. Each user knows his
own channel and buffer state but is unaware of the states and
actions of the other users. We consider the case where each user is
saturated (i.e., always has a packet to transmit) as well as the
case where each user is unsaturated. We formulate the problem as a
Markov game and show connections with strategic form games. We then
consider various throughput functions associated with the multiple
user channel and provide algorithms for finding the corresponding
equilibria.

Keywords: Multiple access channel, Stochastic games, Stationary
policies, Strategic form games, Nash equilibria, Potential games.

I. Introduction 

There has been a tremendous growth of wireless com- 
munication systems over the last few years. The success of 
wireless systems is primarily due to the efficient use of their 
resources. The users are able to obtain their quality of service 
efficiently in a time varying radio channel by adjusting their 
own transmission powers. Distributed control of resources is
an interesting area of study since its alternative involves high
system complexity and large infrastructure due to the presence
of a central controller.

Noncooperative game theory |[T1 is a natural tool to de- 
sign and analyze wireless systems with distributed control 
of resources. Scutari et al. Q, ID analyzed competitive 
maximization of transmission rate and mutual information 
on the multiple access channel subject to power and other 
constraints. Heikkinen [5] analyzed distributed power control
problems via potential games, while Lai et al. [2] applied a
game-theoretic framework to the resource allocation problem in
fading multiple access channels.

Altman et al. [6] studied the problem of maximizing the
throughput of saturated users (a user always has a packet
to transmit) who have a Markov modelled channel and are
subject to power constraints. They considered both the
centralized scenario, where the base station chooses the
transmission power levels for all users, and the decentralized
scenario, where each user chooses its own power level based
on the condition of its radio channel. Altman et al. [7]
later considered the problem of maximizing the throughput
of users in a distributed manner subject to both power and
buffer constraints. The decentralized scenario in [6] and the
distributed resource allocation problem in [7] were analyzed as
constrained Markov games with independent state information,
i.e., no user knows the other users' states. The proof of existence of
the equilibrium policies for such games was given in (E"]. An 
algorithm which guaranteed convergence to the equilibrium 
policies for two users for any throughput function of the 
two users and an algorithm which guaranteed convergence to 
the equilibrium policies for N users when their throughput 
functions are identical were provided in jO). 

Our work is closely related to the above mentioned work.
When restricted to the objective functions in [6], [7] our
problem is exactly the same; however, we present an alternative
view of constrained Markov games with independent state
information. With this view we connect the theory to strategic
form games [10]. The existence of equilibrium policies follows
directly from this viewpoint. This includes both the saturated
and the unsaturated scenarios considered in [6] and [7]
respectively. We also show that the algorithm which guaranteed
convergence to the equilibrium policies for N users with
identical throughput functions can be extended to cases where
the throughput functions of the users may be different.

Besides presenting a unified view of both the saturated
and the unsaturated problem, we also consider the case
where the base station uses successive interference cancellation
rather than a regular matched filter. Here we formulate both
the noncooperative and the cooperative (team problem) setup
and find the equilibria for both problems.

The paper is structured as follows. In Section II we present
the system model for both the saturated (no buffer constraints)
and the unsaturated (both buffer and power constraints)
scenarios. In Section III we set up the problem as a constrained
Markov game with independent state information and define
the so-called equivalent strategic form game. There we provide
a proof of existence of equilibrium policies and define the
idea of a pure strategy and a potential function for Markov
games. In Section IV we consider various throughput functions
associated with the multiple access channel. In Section V
we develop algorithms to compute the equilibrium policies.
Section VI presents numerical results and Section VII concludes
the paper.

II. System Model 

We consider a scenario where a set $N = \{1, \dots, N\}$ of
users access the base station through a channel simultaneously.
Time is divided into slots. The channel for user $i$ is modelled
as an ergodic Markov chain $k_i[n]$ taking values from a finite
index set $K_i = \{0, 1, 2, \dots, k_i^m\}$. The channel gain for
user $i$ in index $k_i$ is $h_i(k_i)$, where $h_i : K_i \to [0, 1]$.
We assume $h_i(0) = 0$.

The transition probability of user $i$ going from channel index
$k_i$ to $k_i'$ is $P^i_{k_i k_i'}$. We assume that in each time slot
each user knows his channel index perfectly but does not know the
channel indices of the other users. Each user has a set of power
indices $L_i = \{0, 1, 2, \dots, l_i^m\}$, where $l_i^m$ is the
largest power index. The power invested by user $i$ at time $n$ is
given by the function $p_i : L_i \to \mathbb{R}$ with the property
that $p_i(0) = 0$, i.e., no power is invested by user $i$ at power
index $l_i = 0$. Let $l_i[n]$ represent the power index used by
user $i$ at time $n$.

For the unsaturated case each user has a queue of finite
length $q_i^m$. Denote $Q_i = \{0, 1, 2, \dots, q_i^m\}$. Let
$\gamma_i[n]$ packets arrive in the queue at time slot $n$ from the
higher layers, where $\{\gamma_i[n], n \ge 0\}$ are independent and
identically distributed (iid) with distribution $r$. In each time
slot a user may transmit at most one packet from its queue if it is
not empty. Let $c_i[n] \in C_i = \{0, 1\}$ be the admission control
variable for user $i$, where $c_i[n] = 1$ denotes accepting all
packets from the upper layer and $c_i[n] = 0$ denotes rejecting all
packets. The incoming packets are accepted until the buffer is full;
the remaining packets are dropped. We assume that a user has no
information about the queues of other users. If $q_i[n]$ and $w_i[n]$
denote the number of packets in the queue and the number of
departures from the buffer in slot $n$, then the queue dynamics
are given as

$$q_i[n+1] = \min\big([q_i[n] + c_i[n]\gamma_i[n] - w_i[n]]^+,\; q_i^m\big), \tag{1}$$

where $[y]^+ = \max(y, 0)$.
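The queue recursion can be illustrated with a few lines of Python. This is a minimal sketch, assuming the recursion has the form min(max(q + c*gamma - w, 0), q_max); the function name `queue_step` and its argument names are hypothetical and not taken from the paper.

```python
def queue_step(q, accept, arrivals, departures, q_max):
    """One slot of the finite-buffer recursion (assumed form):
    q_next = min(max(q + accept * arrivals - departures, 0), q_max)."""
    if accept not in (0, 1):
        raise ValueError("the admission control variable must be 0 or 1")
    backlog = q + accept * arrivals - departures
    # Clip below at 0 (empty buffer) and above at q_max (overflow is dropped).
    return min(max(backlog, 0), q_max)


# Hypothetical trace: (accept, arrivals, departures) for four slots.
q = 3
trace = []
for accept, arrivals, departures in [(1, 2, 1), (0, 5, 1), (1, 9, 0), (1, 0, 1)]:
    q = queue_step(q, accept, arrivals, departures, q_max=10)
    trace.append(q)
```

With admission switched off (`accept = 0`) the arrivals of that slot are ignored entirely, and any backlog above `q_max` is lost, which mirrors the dropping rule described above.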

In time slot $n$ the state $x_i[n]$ and the action $a_i[n]$ of
user $i$ are defined as

$$x_i[n] = (k_i[n], q_i[n]), \qquad a_i[n] = (l_i[n], c_i[n]). \tag{2}$$

The set of states $X_i$ and the set of actions $A_i$ of user $i$ are
$X_i = K_i \times Q_i$ and $A_i = L_i \times C_i$ respectively. The
set of states (actions) of all users other than user $i$ is denoted
as $X_{-i}$ ($A_{-i}$), while the set of states (actions) of all
users is denoted as $X$ ($A$). In the following we present the
details for the unsaturated case and then comment briefly on the
saturated case.

A. Instantaneous throughput and cost for user i:

The throughput obtained by user $i$ is given by the function
$t_i : K \times L \to \mathbb{R}_+$ satisfying $t_i(k, l) = 0$ if
$k_i = 0$ or $l_i = 0$, where $K = \prod_{i=1}^{N} K_i$ and
$L = \prod_{i=1}^{N} L_i$. This implies that the throughput obtained
by user $i$ is 0 if the channel is very bad or no power is invested
by the user. Note that the throughput of user $i$ depends on the
global channel index $k$ and the global power index $l$ of all users.
We define the throughput ($t_i$) and the costs ($c_i^k$) of user $i$
at time $n$ as

$$t_i(x[n], a[n]) = t_i\big(k[n],\; 1_{\{q_i[n] \ne 0\}} \cdot l_i[n];\; i \in N\big), \tag{3}$$
$$c_i^1(x[n], a[n]) = p_i(l_i[n]), \qquad c_i^2(x[n], a[n]) = q_i[n], \tag{4}$$

where $1_A$ represents the indicator function, which is 1 if event
$A$ is true and 0 otherwise. There is thus a power cost and a
queuing cost for user $i$; the latter arises from the stringent delay
requirements which have to be met by the user.

B. Transition probability under each action:

We define the transition probability $P^i_{x_i a_i x_i'}$ of user $i$
going from state $x_i$ to state $x_i'$ under the action $a_i$ as

$$P^i_{x_i a_i x_i'} = P^i_{k_i k_i'} \cdot P^i_{q_i a_i q_i'}, \tag{5}$$

where $P^i_{q_i a_i q_i'}$ is the transition probability of user $i$
going from queue state $q_i$ to queue state $q_i'$ under action $a_i$.

C. Saturated system:

In the saturated system each user always has a packet to
transmit. Thus there is only a power cost for every user. The
state, action and transition probability of user $i$ become
$x_i[n] = k_i[n]$, $a_i[n] = l_i[n]$ and
$P^i_{x_i a_i x_i'} = P^i_{k_i k_i'}$, while the instantaneous
throughput and cost for user $i$ are
$t_i(x[n], a[n]) = t_i(k[n], l[n])$ and
$c_i^1(x[n], a[n]) = p_i(l_i[n])$ respectively.

D. Stationary policies:

Let $M_1(G)$ be the set of probability measures over a set $G$. A
stationary policy for user $i$ is a function $u_i : X_i \to M_1(A_i)$.
The value $u_i(a_i|x_i)$ represents the probability of user $i$ taking
action $a_i$ when it is in state $x_i$. We denote the set of stationary
policies for user $i$ as $U_i$ and the set of all stationary
multipolicies as $U = \prod_{i=1}^{N} U_i$. The set of stationary
multipolicies of all users other than user $i$ is denoted as $U_{-i}$.

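E. Expected time-average rate, costs and constraints:

Let $x_0 := x[0]$ represent the initial state of all users.
Given a stationary multipolicy $u$ for all players, $P^u_{x_0}$
denotes the distribution of the stochastic process $(x[n], a[n])$.
The expectation with respect to this distribution is denoted as
$E^u_{x_0}$. We now define the time-average expected rate as

$$T_i(u) := \limsup_{T \to \infty} \frac{1}{T} \sum_{n=1}^{T} E^u_{x_0}\big(t_i(x[n], a[n])\big), \tag{6}$$

where the expected time-average costs are subject to the
constraints

$$C_i^k(u_i) := \limsup_{T \to \infty} \frac{1}{T} \sum_{n=1}^{T} E^u_{x_0}\big(c_i^k(x[n], a[n])\big) \le \bar{c}_i^k, \tag{7}$$

where $\bar{c}_i^1 = \bar{P}_i$ and $\bar{c}_i^2 = \bar{Q}_i$. In the
saturated scenario $k = 1$; otherwise $k \in \{1, 2\}$. A policy
$u_i$ is called $i$-feasible if it satisfies
$C_i^k(u_i) \le \bar{c}_i^k$ for every such $k$, and a multipolicy is
called feasible if it is $i$-feasible for all users $i \in N$.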



III. Game theoretic formulation 

Each user chooses a stationary policy $u_i \in U_i$ so as to
maximize his expected average reward $T_i(u)$. However, $T_i(u)$
also depends on the stationary policies of the other users, leading
to a noncooperative game. We denote the above formulation as
a constrained Markov game (HI, lITTI . 



$$\Gamma_{cmg} = \big[N, (X_i), (A_i), (P^i), (t_i), (c_i^k), (\bar{c}_i^k)\big],$$

where the elements of the above tuple are as defined previously.
Let $[u_{-i}, v_i]$ denote the multipolicy where users $k \ne i$
use stationary policy $u_k$ while user $i$ uses policy $v_i$. We now
define the Constrained Nash Equilibrium (CNE).

Definition 1: A multipolicy $u \in U$ is called a CNE if for
each player $i \in N$ and for any $v_i \in U_i$ such that
$[u_{-i}, v_i]$ is feasible,

$$T_i(u) \ge T_i(u_{-i}, v_i). \tag{8}$$

An $i$-feasible policy $u_i$ is called an optimal response of
player $i$ against a multipolicy $u_{-i}$ of other users if for any
other $i$-feasible policy $v_i$, (8) holds.

In this paper we limit ourselves to stationary CNE as opposed to
general history dependent Nash equilibria. Stationary policies are
easy to implement and are usually the subject of study. It is shown
in [8] that stationary Nash equilibria are also Nash equilibria in
the general class of policies, although they may only be a proper
subset of these.

A. Calculation of optimal response

Denote the transition probability of user $i$ going from state
$x_i$ to state $y_i$ under the policy $u_i$ as

$$P^i_{x_i u_i y_i} = \sum_{a_i \in A_i} u_i(a_i|x_i)\, P^i_{x_i a_i y_i}. \tag{9}$$

Define the immediate reward for user $i$, when user $i$ has state
$x_i$ and takes action $a_i$ and the other users use multipolicy
$u_{-i}$, as

$$R_i(x_i, a_i) = \sum_{(x_{-i}, a_{-i})} \Big[\prod_{l \ne i} u_l(a_l|x_l)\, \pi^{u_l}(x_l)\Big]\, t_i(x, a), \tag{10}$$

where $\pi^{u_l}(x_l)$ is the steady state probability of user $l$
being in state $x_l$ when it uses policy $u_l$.

Given the stationary policy $u_i \in U_i$ define the occupation
measure as

$$z_i(x_i, a_i) = \pi^{u_i}(x_i) \cdot u_i(a_i|x_i). \tag{11}$$

The occupation measure $z_i(x_i, a_i)$ for user $i$ is the
steady-state probability of the user being in state $x_i \in X_i$ and
using action $a_i \in A_i$. Given the occupation measure $z_i$, the
stationary policy $u_i$ is recovered as

$$u_i(a_i|x_i) = \frac{z_i(x_i, a_i)}{\sum_{a_i' \in A_i} z_i(x_i, a_i')}, \quad \forall (x_i, a_i) \in X_i \times A_i. \tag{12}$$

Then the time-average expected rate and costs under the
multipolicy $u$ are

$$T_i(u) = \sum_{(x_i, a_i)} R_i(x_i, a_i)\, z_i(x_i, a_i), \tag{13}$$
$$C_i^k(u_i) = \sum_{(x_i, a_i)} c_i^k(x_i, a_i)\, z_i(x_i, a_i). \tag{14}$$
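The correspondence between a stationary policy and its occupation measure can be checked numerically for a single user. The sketch below is only an illustration under stated assumptions: the two-state, two-action transition table is made up, the stationary distribution is obtained by plain power iteration (which presumes an ergodic chain), and all function names are hypothetical.

```python
def induced_chain(P, u):
    """Transition matrix under policy u: P_u[x][y] = sum_a u[x][a] * P[x][a][y]."""
    n = len(P)
    return [[sum(u[x][a] * P[x][a][y] for a in range(len(u[x]))) for y in range(n)]
            for x in range(n)]


def stationary(P_u, iterations=2000):
    """Steady-state distribution by power iteration (assumes an ergodic chain)."""
    n = len(P_u)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[x] * P_u[x][y] for x in range(n)) for y in range(n)]
    return pi


def occupation_measure(P, u):
    """z[x][a] = pi(x) * u(a|x): steady-state probability of the pair (x, a)."""
    pi = stationary(induced_chain(P, u))
    return [[pi[x] * u[x][a] for a in range(len(u[x]))] for x in range(len(P))]


def policy_from_measure(z):
    """Recover u(a|x) = z(x, a) / sum_a' z(x, a')."""
    policy = []
    for row in z:
        mass = sum(row)
        # Assumption: states with zero mass get a uniform action choice.
        policy.append([v / mass for v in row] if mass > 0
                      else [1.0 / len(row)] * len(row))
    return policy


# Hypothetical example: P[x][a][y] with two states and two actions.
P = [[[0.9, 0.1], [0.5, 0.5]],
     [[0.4, 0.6], [0.2, 0.8]]]
u = [[0.5, 0.5], [1.0, 0.0]]
z = occupation_measure(P, u)
recovered = policy_from_measure(z)
```

In this example the recovered policy coincides with `u`, and any linear average such as a power cost can be evaluated directly as a weighted sum over `z`.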



B. Best response of player i 

Let all users other than user i use the multipolicy u_i. Then 
user i has an optimal stationary best response policy which is 
independent of the initial state xq JH). Let the set of optimal 
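stationary policies of user $i$ be denoted as $BR(u_{-i})$. We can
compute the elements of this set from the following linear
program:

Find $z_i^* = [z_i^*(x_i, a_i)]$, $(x_i, a_i) \in X_i \times A_i$,
that maximizes

$$T_i(u) = \sum_{(x_i, a_i)} R_i(x_i, a_i)\, z_i(x_i, a_i), \tag{15}$$

subject to

$$\sum_{(x_i, a_i)} \big[1_{y_i}(x_i) - P^i_{x_i a_i y_i}\big]\, z_i(x_i, a_i) = 0, \quad \forall y_i \in X_i, \tag{16}$$
$$C_i^k(u_i) = \sum_{(x_i, a_i)} c_i^k(x_i, a_i)\, z_i(x_i, a_i) \le \bar{c}_i^k, \quad \forall k \in \{1, 2\}, \tag{17}$$
$$\sum_{(x_i, a_i)} z_i(x_i, a_i) = 1, \qquad z_i(x_i, a_i) \ge 0, \quad \forall (x_i, a_i) \in X_i \times A_i. \tag{18}$$

Here $1_{y_i}(x_i)$ equals 1 if $x_i = y_i$ and 0 otherwise. Note
that the above linear program can be adapted to the saturated
scenario simply by choosing $k = 1$. The linear program for the
saturated scenario can be presented in a much simpler form [6]. The
constraints (16)-(18) are referred to in matrix form as
$A_{us} \cdot z_i \le b_{us}$.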
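The three groups of LP constraints (flow balance, cost bounds, normalization with non-negativity) can be checked for a candidate occupation measure without an LP solver. The following is a hedged sketch: the helper `check_lp_constraints`, the toy transition table and the cost table are all hypothetical, and the check only tests feasibility, it does not optimize.

```python
def check_lp_constraints(z, P, costs, bounds, tol=1e-9):
    """Feasibility check for a candidate occupation measure z[x][a].

    P[x][a][y] are transition probabilities, costs[k][x][a] the cost
    functions and bounds[k] their upper bounds (all names are assumptions).
    """
    states = range(len(z))
    pairs = [(x, a) for x in states for a in range(len(z[x]))]
    # Flow balance: sum_{x,a} [1{x == y} - P[x][a][y]] * z[x][a] = 0 for every y.
    balance = all(
        abs(sum(((1.0 if x == y else 0.0) - P[x][a][y]) * z[x][a]
                for x, a in pairs)) <= tol
        for y in states
    )
    # Average cost constraints: sum_{x,a} c_k[x][a] * z[x][a] <= bound_k.
    within_cost = all(
        sum(c[x][a] * z[x][a] for x, a in pairs) <= bound + tol
        for c, bound in zip(costs, bounds)
    )
    # Normalization and non-negativity.
    normalized = (abs(sum(z[x][a] for x, a in pairs) - 1.0) <= tol
                  and all(z[x][a] >= -tol for x, a in pairs))
    return {"balance": balance, "cost": within_cost, "normalized": normalized}


# Hypothetical two-state, two-action user; the cost is the action index.
P = [[[0.9, 0.1], [0.5, 0.5]],
     [[0.4, 0.6], [0.2, 0.8]]]
power_cost = [[0, 1], [0, 1]]
good = check_lp_constraints([[2 / 7, 2 / 7], [3 / 7, 0.0]], P, [power_cost], [0.5])
bad = check_lp_constraints([[0.5, 0.0], [0.5, 0.0]], P, [power_cost], [0.5])
```

The second candidate sums to one and respects the cost bound but violates flow balance, so it is not the occupation measure of any stationary policy for this chain.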

C. Equivalent Strategic form game: 

In this section we will show that the above Markov game 
is equivalent to a usual strategic form (nonstochastic) game. 
We will use this equivalence to show existence of the CNE 
and also provide algorithms to find them and show their 
convergence. 

Define a strategic form game
$\Gamma_e = (N, \{V_i\}_{i \in N}, \{r_i\}_{i \in N})$ where
$V_i := \{1, 2, \dots, v_i^m\}$. Each point $v_i \in V_i$ corresponds
to an endpoint (vertex) $[z_i(x_i, a_i)]$,
$(x_i, a_i) \in X_i \times A_i$, of the polyhedron formed by the
constraints $A_{us} \cdot z_i \le b_{us}$ and will be denoted as
$v_i := [v_i(x_i, a_i)]$, $(x_i, a_i) \in X_i \times A_i$. The utility
function $r_i : V \to \mathbb{R}$, where $V = \prod_{i \in N} V_i$, is
defined as

$$r_i(v) = r_i(v_1, v_2, \dots, v_N) := \sum_{(x_i, a_i)} R_i^{v_{-i}}(x_i, a_i)\, v_i(x_i, a_i), \tag{19}$$

where

$$R_i^{v_{-i}}(x_i, a_i) := \sum_{(x_{-i}, a_{-i})} \Big[\prod_{l \ne i} v_l(x_l, a_l)\Big]\, t_i(x, a). \tag{20}$$

Let $\lambda_i$ be a mixed strategy for player $i$. Denote the set of
mixed strategies of player $i$ as $\Delta(V_i)$. The expected utility
of player $i$ when all players use the strategy tuple
$\lambda = (\lambda_1, \lambda_2, \dots, \lambda_N)$ is given as
$r_i(\lambda) := E_\lambda(r_i)$, where $E_\lambda$ denotes
expectation with respect to the global mixed strategy $\lambda$.
Define the set of optimal strategies for player $i$, when the other
players use strategy $\lambda_{-i}$, as

$$BR_i(\lambda_{-i}) = \big\{\lambda_i^* : \lambda_i^* \in \arg\max_{\lambda_i} r_i(\lambda_i, \lambda_{-i})\big\}. \tag{21}$$



D. Existence of Nash Equilibrium

The following proposition establishes a connection between
any global multipolicy $u$ for the constrained Markov game
$\Gamma_{cmg}$ and some global mixed strategy $\lambda$ in the
equivalent strategic form game $\Gamma_e$.

Proposition 1: Given any multipolicy $u_{-i}$ of players other than
$i$, there exists a $u_i^* \in BR(u_{-i})$ if and only if there
exist $\lambda_{-i}$ for players other than $i$ and a
$\lambda_i^* \in BR_i(\lambda_{-i})$ such that
$T_i(u_i^*, u_{-i}) = r_i(\lambda_i^*, \lambda_{-i})$.

Proof: Refer to [11]. ■

The existence of a CNE for the constrained Markov game
$\Gamma_{cmg}$ follows from the above proposition.

Theorem 1: There exists a CNE for the constrained Markov
game $\Gamma_{cmg}$.

Proof: Since $\Gamma_e$ is a finite game, there exists a mixed
strategy Nash equilibrium for the equivalent strategic form game
$\Gamma_e$; let it be denoted by $\lambda^*$. It follows that
$r_i(\lambda_i^*, \lambda_{-i}^*) \ge r_i(\lambda_i, \lambda_{-i}^*)$,
$\forall \lambda_i$, $\forall i \in N$. From Proposition 1 we can find
an equivalent $u^*$ for $\lambda^*$ such that
$T_i(u_i^*, u_{-i}^*) = r_i(\lambda_i^*, \lambda_{-i}^*) \ge
r_i(\lambda_i, \lambda_{-i}^*) = T_i(u_i, u_{-i}^*)$,
$\forall u_i \in U_i$, $\forall i \in N$. This proves
that $u^*$ is a CNE. ■

E. Potential Games 

We first define the idea of a pure strategy and a pure strategy
Nash equilibrium (PSNE) for the constrained Markov game
$\Gamma_{cmg}$.

Definition 2: A policy $u_i$ for player $i$ is called a pure policy
or pure strategy of the constrained Markov game $\Gamma_{cmg}$ if the
mixed strategy $\lambda_i$ corresponding to this policy is a pure
strategy. We say that a constrained Markov game $\Gamma_{cmg}$ has a
PSNE if the equivalent strategic form game has a PSNE.

Definition 3: A strategic form game $\Gamma_e$ is called a potential
game if there exists a function $r : V \to \mathbb{R}$ such that
$\forall i \in N$,

$$r_i(v_i, v_{-i}) - r_i(\hat{v}_i, v_{-i}) = r(v_i, v_{-i}) - r(\hat{v}_i, v_{-i}), \quad \forall v_i, \hat{v}_i \in V_i,\; \forall v_{-i} \in V_{-i}.$$

$\Gamma_{cmg}$ is a potential game if the corresponding $\Gamma_e$ is
a potential game.

Consider the class of strategic form games

$$\Gamma(k) = \big(N, \{L_i\}_{i \in N}, \{t_i(k, \cdot)\}_{i \in N}\big), \quad k \in K. \tag{22}$$

Lemma 1: If $\Gamma(k)$ is a potential game for each $k \in K$, then
the constrained Markov game $\Gamma_{cmg}$ is a potential game.

Proof: Refer to [11]. ■

Refer to an example in [11].
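For small finite games the potential property can be verified by brute force. The sketch below assumes the exact-potential condition (the change in a deviating player's utility equals the change in the potential); the helper `is_exact_potential` and both toy games are hypothetical illustrations and are not the example referred to in the text.

```python
from itertools import product


def is_exact_potential(strategy_sets, payoff, potential):
    """Check r_i(v) - r_i(v') == phi(v) - phi(v') for every unilateral deviation."""
    for profile in product(*strategy_sets):
        for i, choices in enumerate(strategy_sets):
            for alt in choices:
                deviated = profile[:i] + (alt,) + profile[i + 1:]
                gain = payoff(i, profile) - payoff(i, deviated)
                if gain != potential(profile) - potential(deviated):
                    return False
    return True


# Toy two-player game (assumption): utility v_i * (4 - v_1 - v_2).
def cournot_payoff(i, v):
    return v[i] * (4 - sum(v))


def cournot_potential(v):
    return 4 * sum(v) - sum(x * x for x in v) - v[0] * v[1]


# Matching pennies: player 0 wants to match, player 1 wants to differ.
def pennies_payoff(i, v):
    match = 1 if v[0] == v[1] else -1
    return match if i == 0 else -match


levels = [[0, 1, 2], [0, 1, 2]]
coins = [[0, 1], [0, 1]]
cournot_ok = is_exact_potential(levels, cournot_payoff, cournot_potential)
pennies_ok = is_exact_potential(coins, pennies_payoff, lambda v: pennies_payoff(0, v))
```

Integer payoffs are used so that the equality test is exact; with floating-point utilities a tolerance would be needed.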



IV. Throughput functions 

The base station may use a regular matched filter or a
successive interference cancellation (SIC) filter. We assume
that each user is aware of the filter adopted at the base station
to decode the users' transmissions. Each of the two cases
results in different throughput functions for the users, which
we characterize in the subsequent subsections.

A. Regular matched filter 

When the base station uses a regular matched filter, the
received packet of any user is decoded by treating the signals
of other users as noise. In this case, the throughput function
for user $i$ is

$$t_i^{in}(k, l) = \log_2\Big(1 + \frac{h_i(k_i)\, p_i(l_i)}{N_0 + \sum_{j \ne i} h_j(k_j)\, p_j(l_j)}\Big). \tag{23}$$

Note that $t_i^{in}(k, l)$ is an upper bound for the throughput of
user $i$. On the other hand the users may want to maximize
the aggregated throughput in a decentralized manner. In this
case the joint objective function when they use action $a \in A$
at state $x \in X$ is

$$t^{s}(k, l) = \sum_{i=1}^{N} t_i^{in}(k, l). \tag{24}$$

The interference cancellation Markov game is $\Gamma_{cmg}$ with
$t_i = t_i^{in}$ and the sum throughput Markov game is $\Gamma_{cmg}$
with $t_i = t^{s}$. They are denoted as $\Gamma^{in}_{cmg}$ and
$\Gamma^{s}_{cmg}$ respectively. These throughput functions were
considered in [6], [7].
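A short numerical sketch of the matched-filter throughput may help. It assumes the Shannon-type form log2(1 + SINR), with the other users' received powers added to the noise, and takes the sum throughput as the sum of the individual rates; the function names and the sample gains and powers are hypothetical.

```python
import math


def matched_filter_throughput(i, gains, powers, noise=1.0):
    """log2(1 + SINR) for user i, treating all other users as noise (assumed form)."""
    signal = gains[i] * powers[i]
    interference = sum(g * p for j, (g, p) in enumerate(zip(gains, powers)) if j != i)
    return math.log2(1.0 + signal / (noise + interference))


def sum_throughput(gains, powers, noise=1.0):
    """Common objective of the sum throughput game: the sum of all user rates."""
    return sum(matched_filter_throughput(i, gains, powers, noise)
               for i in range(len(gains)))


# Hypothetical sample: three users, channel gains in [0, 1], integer powers.
gains = [1.0, 2.0 / 3.0, 0.0]
powers = [1, 2, 3]
rates = [matched_filter_throughput(i, gains, powers) for i in range(3)]
total = sum_throughput(gains, powers)
```

A user whose channel gain or power is zero obtains zero throughput, in line with the requirement placed on the throughput function in Section II.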

B. Successive Interference Cancellation 

When the base station uses a successive interference
cancellation filter it decodes the data of users in a predefined
order at each time slot. Given an ordering scheme on the
set of users $N$, the received packet of a user $i$ is decoded
after cancelling out, from the received transmission, the decoded
transmissions of the users lying below user $i$ in the predefined
order. We assume perfect cancellation of the decoded
signal from the received transmission [7].

We first show how to choose the decoding order for each
time slot. We define the "Endpoint SIC schemes", where the
decoding order is fixed for all time slots. Using these, we then
define the "Randomized SIC schemes", where the decoding order
for each time slot is chosen randomly from some distribution.
We assume that the distribution is known to all users but that
they do not know the decoding order at each time slot.

1) Endpoint SIC schemes: Here the decoding order is the same
for each time slot $n$. Given the set of users $N$, define the
$m$-th permutation of $N$ as the ordered set $\sigma_N(m)$, where
$m$ indexes one of the $N!$ possible permutations. Let $b_i(m)$
denote the set of players who are indexed above user $i$ in
the set $\sigma_N(m)$. We define the $m$-th utility function of
user $i$ as

$$t_i^m(k, l) = \log_2\Big(1 + \frac{h_i(k_i)\, p_i(l_i)}{N_0 + \sum_{j \in b_i(m)} h_j(k_j)\, p_j(l_j)}\Big). \tag{25}$$

The above utility function for player $i$ indicates that all users
indexed below user $i$ in the set $\sigma_N(m)$ are decoded before
user $i$ and their signals are cancelled out from the received
signal, after which the signal of user $i$ is decoded. The $m$-th
endpoint SIC Markov game is $\Gamma_{cmg}$ with $t_i = t_i^m$
$\forall i \in N$ and is denoted as $\Gamma^m_{cmg}$.

2) Randomized SIC schemes: Here the decoding order is
chosen at each time slot $n$ with a probability. Though each
user knows the probability distribution at each time slot $n$, he
does not know the exact decoding order. If the probability mass
function $\alpha = \{\alpha(m)\}$ over the set
$N! = \{1, 2, \dots, N!\}$ is chosen, then the utility function of
user $i$ is

$$t_i^{\alpha}(k, l) = \sum_{m} \alpha(m)\, t_i^m(k, l). \tag{26}$$

The $\alpha$-randomized SIC Markov game is $\Gamma_{cmg}$ with
$t_i = t_i^{\alpha}$ $\forall i \in N$ and is denoted as
$\Gamma^{\alpha}_{cmg}$. Note that a randomization $\alpha$ such that
$\alpha(m) = 1$ for some $m$ corresponds to the endpoint game
$\Gamma^m_{cmg}$. In the next subsection we find randomizations
$\alpha$ for which $\Gamma^{\alpha}_{cmg}$ has a pure strategy Nash
equilibrium.
3) Randomized games with PSNE: In this section we construct
randomizations $\alpha$ for which the resulting randomized
games have PSNEs. Take a partition $P_1, P_2, \dots, P_k$ of the
set $N$ where $1 \le k \le N$. Let $s(p_a) = P_1, P_2, \dots, P_k$
denote this particular partition of the set $N$, where $p_a$ indexes
the partition.

Let $s(p_a, p_e) = (P_{e1} P_{e2} \cdots P_{ek})$ denote the ordered
set formed by the $p_e$-th permutation of the partition sets
$P_1, P_2, \dots, P_k$. Note that $1 \le p_e \le k!$. Define the
support set $S(p_a, p_e)$ as

$$S(p_a, p_e) := \big\{ m : \sigma_N(m) = \sigma_{P_{e1}}(m_1)\, \sigma_{P_{e2}}(m_2) \cdots \sigma_{P_{ek}}(m_k),\; 1 \le m_1 \le |P_{e1}|!, \dots, 1 \le m_k \le |P_{ek}|! \big\},$$

where $\sigma_G(m)$ refers to the $m$-th permutation of the set $G$.
The set $S(p_a, p_e)$ contains all the permutations $m$ for which the
randomization $\alpha$ (to be defined next) has a positive value,
i.e., $\alpha(m) > 0$ $\forall m \in S(p_a, p_e)$. We now define the
randomization $\alpha(p_a, p_e)$ as

$$\alpha(m) = \begin{cases} \dfrac{1}{|P_1|!\,|P_2|! \cdots |P_k|!}, & m \in S(p_a, p_e), \\ 0, & \text{otherwise.} \end{cases} \tag{27}$$



The following example shows the construction for
$N = \{1, 2, 3\}$.

Example 1: $N = \{1, 2, 3\}$. The permutation sets of $N$ are
$\sigma_N(1) = (1,2,3)$, $\sigma_N(2) = (1,3,2)$,
$\sigma_N(3) = (2,1,3)$, $\sigma_N(4) = (2,3,1)$,
$\sigma_N(5) = (3,1,2)$ and $\sigma_N(6) = (3,2,1)$.

The possible partitions of the set $N$ are
$s(1) = \{1\},\{2\},\{3\}$, $s(2) = \{1\},\{2,3\}$,
$s(3) = \{2\},\{1,3\}$, $s(4) = \{3\},\{1,2\}$ and
$s(5) = \{1,2,3\}$. The ordered sets formed by the corresponding
permutations of the partitions are
$s(1,1) = (\{1\}\{2\}\{3\})$, $s(1,2) = (\{1\}\{3\}\{2\})$,
$s(1,3) = (\{2\}\{1\}\{3\})$, $s(1,4) = (\{2\}\{3\}\{1\})$,
$s(1,5) = (\{3\}\{1\}\{2\})$, $s(1,6) = (\{3\}\{2\}\{1\})$,
$s(2,1) = (\{1\}\{2,3\})$, $s(2,2) = (\{2,3\}\{1\})$,
$s(3,1) = (\{2\}\{1,3\})$, $s(3,2) = (\{1,3\}\{2\})$,
$s(4,1) = (\{3\}\{1,2\})$, $s(4,2) = (\{1,2\}\{3\})$ and
$s(5,1) = (\{1,2,3\})$.

The support sets resulting from the above ordered sets are
$S(1,1) = \{1\}$, $S(1,2) = \{2\}$, $S(1,3) = \{3\}$,
$S(1,4) = \{4\}$, $S(1,5) = \{5\}$, $S(1,6) = \{6\}$,
$S(2,1) = \{1,2\}$, $S(2,2) = \{4,6\}$, $S(3,1) = \{3,4\}$,
$S(3,2) = \{2,5\}$, $S(4,1) = \{5,6\}$, $S(4,2) = \{1,3\}$ and
$S(5,1) = \{1,2,3,4,5,6\}$. The above support sets lead to the
following randomizations:

TABLE I
Randomizations with PSNE

            α(1)   α(2)   α(3)   α(4)   α(5)   α(6)
α(1,1)      1      0      0      0      0      0
α(1,2)      0      1      0      0      0      0
α(1,3)      0      0      1      0      0      0
α(1,4)      0      0      0      1      0      0
α(1,5)      0      0      0      0      1      0
α(1,6)      0      0      0      0      0      1
α(2,1)      1/2    1/2    0      0      0      0
α(2,2)      0      0      0      1/2    0      1/2
α(3,1)      0      0      1/2    1/2    0      0
α(3,2)      0      1/2    0      0      1/2    0
α(4,1)      0      0      0      0      1/2    1/2
α(4,2)      1/2    0      1/2    0      0      0
α(5,1)      1/6    1/6    1/6    1/6    1/6    1/6
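The support sets and randomizations of Example 1 can be generated mechanically. The sketch below is an illustration under the assumption that permutations are indexed in lexicographic order, as in the example; `support_set` and `randomization` are hypothetical helper names.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial


def support_set(users, ordered_blocks):
    """Indices m (1-based, lexicographic) of the permutations of `users` that list
    the blocks in the given order, with any internal order inside each block."""
    all_perms = list(permutations(sorted(users)))
    index = {perm: m for m, perm in enumerate(all_perms, start=1)}
    support = set()
    for parts in product(*(permutations(sorted(block)) for block in ordered_blocks)):
        perm = tuple(user for part in parts for user in part)
        support.add(index[perm])
    return sorted(support)


def randomization(users, ordered_blocks):
    """Uniform weights on the support set and zero elsewhere."""
    support = support_set(users, ordered_blocks)
    # The support has prod_j |P_j|! elements, so 1/len(support) is the uniform weight.
    weight = Fraction(1, len(support))
    return [weight if m in support else Fraction(0)
            for m in range(1, factorial(len(users)) + 1)]


users = {1, 2, 3}
s21 = support_set(users, [[1], [2, 3]])      # ordered set ({1}{2,3})
s22 = support_set(users, [[2, 3], [1]])      # ordered set ({2,3}{1})
alpha42 = randomization(users, [[1, 2], [3]])  # ordered set ({1,2}{3})
```

Using `Fraction` keeps the weights exact, which makes it easy to compare them with tabulated values such as 1/2 and 1/6.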



The next theorem shows that the randomizations constructed
in this section lead to games which have PSNEs.

Theorem 2: Any Markov game $\Gamma^{\alpha}_{cmg}$ with
$\alpha = \alpha(p_a, p_e)$ has a pure strategy Nash equilibrium.

Proof: Refer to [11]. ■

C. Sum Capacity utility function

We define the sum capacity utility function as

$$t^{sc}(k, l) = \log_2\Big(1 + \frac{\sum_{i=1}^{N} h_i(k_i)\, p_i(l_i)}{N_0}\Big). \tag{28}$$

For any probability distribution $\alpha$ we have

$$\sum_{i=1}^{N} t_i^{\alpha}(k, l) = t^{sc}(k, l).$$

We can interpret the sum capacity utility function as the
aggregated sum throughput that each user maximizes in
a decentralized manner when the base station is using a
SIC decoder. The sum capacity Markov game is $\Gamma_{cmg}$ with
$t_i = t^{sc}$ $\forall i \in N$ and is denoted as
$\Gamma^{sc}_{cmg}$.
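The claim that the SIC rates add up to the sum capacity for every decoding order can be checked numerically. The sketch assumes rates of the form log2(1 + SINR) and that the users listed earlier in an order are the ones not yet cancelled when a given user is decoded; the function names and sample values are hypothetical.

```python
import math
from itertools import permutations


def sic_rates(order, gains, powers, noise=1.0):
    """Per-user rates for one decoding order (assumed convention: users listed
    earlier in `order` still interfere with the users listed after them)."""
    rates = {}
    interference = 0.0
    for user in order:
        signal = gains[user] * powers[user]
        rates[user] = math.log2(1.0 + signal / (noise + interference))
        interference += signal
    return rates


def sum_capacity(gains, powers, noise=1.0):
    """log2(1 + total received power / noise)."""
    return math.log2(1.0 + sum(g * p for g, p in zip(gains, powers)) / noise)


# Hypothetical sample with three users (indices 0, 1, 2).
gains = [1.0, 2.0 / 3.0, 1.0 / 3.0]
powers = [2, 3, 1]
capacity = sum_capacity(gains, powers)
gaps = [abs(sum(sic_rates(order, gains, powers).values()) - capacity)
        for order in permutations(range(3))]
```

The sum telescopes for each fixed order, so it also holds for any mixture of orders because the mixing weights sum to one.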

V. Algorithms

In this section we give the algorithms to compute the
CNE for the Markov games $\Gamma^{in}_{cmg}$, $\Gamma^{s}_{cmg}$,
$\Gamma^{sc}_{cmg}$ and $\Gamma^{\alpha}_{cmg}$ whenever
$\alpha = \alpha(p_a, p_e)$ for some partition $s(p_a)$ of $N$ and
permutation $p_e$ of the partition sets. Algorithm 1 is used to
compute the Nash equilibrium for the first three Markov games,
while Algorithm 2 is used to compute the equilibrium for the
randomized game $\Gamma^{\alpha}_{cmg}$. Note that Algorithm 1 was
considered in [8] and its proof for identical interest throughput
functions (i.e., $\Gamma^{s}_{cmg}$) was also given. We extend the
proof to $\Gamma^{in}_{cmg}$.



Algorithm 1

1: Initialize multipolicy $u^0 \in U$ and set $t = 1$.
2: for all $1 \le i \le N$ do
       Compute $u_i^t \in BR(u_{-i}^t)$ by solving the LP
       using the simplex algorithm, where
       $u_{-i}^t = (u_1^t, \dots, u_{i-1}^t, u_{i+1}^{t-1}, \dots, u_N^{t-1})$.
       if $T_i(u_i^t, u_{-i}^t) = T_i(u_i^{t-1}, u_{-i}^t)$ then
           set the updated value $u_i^t := u_i^{t-1}$
       end if
   end for
3: if $u^t = u^{t-1}$ then
       stop, else set $t := t + 1$ and go to step 2
   end if
4: $u^t$ is the CNE
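The round-robin structure of Algorithm 1 can be illustrated on a toy finite game. The sketch below is not the authors' implementation: it replaces the LP best response with plain enumeration over a small pure-strategy set, and `best_response_iteration`, `common_payoff` and the example utility are hypothetical. It keeps the tie rule of the algorithm, i.e., a player retains his old strategy unless the best response is strictly better.

```python
def best_response_iteration(strategy_sets, payoff, start, max_rounds=100):
    """Round-robin best responses on a finite game (hypothetical helper).

    strategy_sets[i] lists player i's strategies, payoff(i, profile) returns
    player i's utility, and start is the initial strategy profile.
    """
    profile = list(start)
    for _ in range(max_rounds):
        previous = tuple(profile)
        for i, choices in enumerate(strategy_sets):
            def value(s, i=i):
                # Utility of player i when only his strategy is replaced by s.
                return payoff(i, tuple(profile[:i] + [s] + profile[i + 1:]))
            best = max(choices, key=value)
            # Tie rule: keep the old strategy unless strictly better.
            if value(best) > value(profile[i]):
                profile[i] = best
        if tuple(profile) == previous:
            return tuple(profile)  # no player changed during a full round
    raise RuntimeError("best-response iteration did not settle")


def common_payoff(i, profile):
    # Identical-interest toy utility (assumption): the total level should be 3.
    return -(sum(profile) - 3) ** 2


equilibrium = best_response_iteration([[0, 1, 2]] * 3, common_payoff, (0, 0, 0))
```

Because every player shares the same utility in this example, each strict improvement raises the common value, so the iteration must stop after finitely many rounds; this mirrors the identical-interest case mentioned above.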



We define the restriction of $\Gamma_{cmg}$ which is used in
Algorithm 2. Given any set $S \subset N$ of users and a policy $u_i$
for all $i \in N \setminus S$, we define the restriction of
$\Gamma_{cmg}$ on the set $S$ as the constrained Markov game in which
only the set $S$ of users participates in the game $\Gamma_{cmg}$
while the users $i \in N \setminus S$ use the predefined policy
$u_i$. We denote the restricted game as $\Gamma_{cmg}(S)$.

Let $s(p_a, p_e) = (P_{e1} P_{e2} \cdots P_{ek})$ denote the ordered
set formed by the $p_e$-th permutation of the partition
$s(p_a) = P_1, P_2, \dots, P_k$. We compute the PSNE for the game
$\Gamma^{\alpha}_{cmg}$ with $\alpha = \alpha(p_a, p_e)$ induced by
the partition $p_a$ and permutation $p_e$.



Algorithm 2

Initialize multipolicy $u^0 \in U$
for all $1 \le j \le k$ do
    if user $i \in P_{el}$ where $l < j$ then
        Set $u_i = u_i^*$
    end if
    if user $i \in P_{el}$ where $l > j$ then
        Set $u_i = u_i^0$
    end if
    Compute $\hat{u}_i$ for all $i \in P_{ej}$ by running Algorithm 1
    on the restricted Markov game $\Gamma_{cmg}(P_{ej})$.
    Set $u_i^* = \hat{u}_i$ for all $i \in P_{ej}$.
end for
$u_i^*$, $i \in N$, is the required PSNE.



The convergence of Algorithms 1 and 2 is proved in [11].

VI. Numerical results 

The channel model considered is the BF-FSMC model [?]. The
channel transition probabilities are $P_{0,0} = 1/2$,
$P_{0,1} = 1/2$, $P_{k_i^m, k_i^m - 1} = 1/2$,
$P_{k_i^m, k_i^m} = 1/2$; $P_{k_i, k_i} = 1/3$,
$P_{k_i, k_i - 1} = 1/3$, $P_{k_i, k_i + 1} = 1/3$
$(1 \le k_i \le k_i^m - 1)$. The channel gain and the power function
are $h_i(k_i) = k_i / k_i^m$ and $p_i(l_i) = l_i$ respectively.
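The channel chain described above is a birth-death chain, so its steady state follows from detailed balance. The sketch below is an illustration that assumes the transition probabilities read as 1/2 at the two boundary states and 1/3 for each move at interior states, with gain k / k_max; the function names are hypothetical.

```python
from fractions import Fraction


def bf_fsmc_matrix(k_max):
    """Birth-death transition matrix on {0, ..., k_max} (assumed probabilities)."""
    half, third = Fraction(1, 2), Fraction(1, 3)
    n = k_max + 1
    P = [[Fraction(0)] * n for _ in range(n)]
    P[0][0] = P[0][1] = half
    P[k_max][k_max - 1] = P[k_max][k_max] = half
    for k in range(1, k_max):
        P[k][k - 1] = P[k][k] = P[k][k + 1] = third
    return P


def stationary_distribution(P):
    """Detailed balance: pi[k] * P[k][k+1] = pi[k+1] * P[k+1][k]."""
    weights = [Fraction(1)]
    for k in range(len(P) - 1):
        weights.append(weights[-1] * P[k][k + 1] / P[k + 1][k])
    total = sum(weights)
    return [w / total for w in weights]


k_max = 3
P = bf_fsmc_matrix(k_max)
pi = stationary_distribution(P)
# Average channel gain under the steady state, with gain h(k) = k / k_max.
mean_gain = sum(p * Fraction(k, k_max) for k, p in enumerate(pi))
```

Exact fractions are used so that the stationarity condition can be verified without rounding.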

The following parameters are fixed for all users: $k_i^m = 3$,
$l_i^m = 5$, $q_i^m = 10$, $\bar{P}_i = 2$ and $\bar{Q}_i = 5$.
$\gamma_i[n]$ has a Poisson distribution with rate 0.3 and
$N_0 = 1$. The throughputs obtained at the equilibria for the
various games are tabulated in Table II in the user order
{1, 2, 3}. Note that the randomized game $\alpha(2, 1)$ has multiple
equilibria. Please refer to [11] for the optimal policies.



TABLE II
Optimal User Throughput

Game / System Model      Saturated                Unsaturated
Γ^in_cmg                 .5263, .5263, .5263      .4649, .4649, .4649
Γ^α_cmg, α = α(1,1)      1.0644, .6969, .5068     .6949, .5649, .4649
Γ^α_cmg, α = α(4,2)      .8836, .8836, .5082      .6299, .6299, .4649
Γ^α_cmg, α = α(5,1)      .7566, .7566, .7566      .5749, .5749, .5749
Γ^α_cmg, α = α(2,1)      1.0644, .6035, .5987     .6949, .5149, .5149
                         1.0644, .5987, .6035
Γ^s_cmg                  1.6139                   1.3959
Γ^sc_cmg                 2.2789                   1.7246
VII. Conclusions

We have considered decentralized scheduling of a wireless
channel by multiple users. The users may be saturated or
unsaturated. The decoder at the base station may employ
a matched filter or successive interference cancellation. The
users know only their own channel states. The system is
modelled as a constrained Markov game with independent
state information. We have proved the existence of equilibrium
policies and provided algorithms to find these policies. For
this, we first convert the Markov game into an equivalent
strategic form game.